Similar resources
An Efficient Mechanism for Navigating Web Using Mobile Web Crawler
With the fast-paced growth of the World Wide Web, its dynamic nature, and its large volume of content, web crawlers have become an indispensable part of search engines. The growing use of search engines and everyday dependence on them require that correct and relevant information be presented to users in response to their search queries. Web crawler plays an i...
World Wide Web Crawler
We describe our ongoing work on world wide web crawling, a scalable web crawler architecture that can use resources distributed world-wide. The architecture allows us to use loosely managed compute nodes (PCs connected to the Internet), and may save network bandwidth significantly. In this poster, we discuss why such architecture is necessary, point out difficulties in designing such architectu...
Reinforcement-Based Web Crawler
This paper presents a focused web crawler system that automatically creates a minority-language corpus. The system uses a database of relevant and irrelevant documents to test the relevance of retrieved web documents. It requires a starting web document to indicate where the search should begin.
Web Crawler Architecture
Definition: A web crawler is a program that, given one or more seed URLs, downloads the web pages associated with these URLs, extracts any hyperlinks contained in them, and recursively continues to download the web pages identified by these hyperlinks. Web crawlers are an important component of web search engines, where they are used to collect the corpus of web pages indexed by the search engin...
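The definition above (start from seed URLs, download each page, extract its hyperlinks, and continue with the pages those links identify) can be illustrated with a minimal Python sketch. It is not taken from any of the listed papers. The names `crawl`, `LinkExtractor`, `fetch`, `max_pages`, and the `TOY_WEB` pages are all hypothetical, and the recursion in the definition is written as an iterative breadth-first loop over a frontier queue. Politeness rules, robots.txt handling, and error recovery are deliberately left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl starting from the seed URLs.

    fetch is a hypothetical callable mapping a URL to its HTML string,
    or to None when the page cannot be downloaded.
    Returns a dict mapping each downloaded URL to its HTML.
    """
    frontier = deque(dict.fromkeys(seeds))  # de-duplicated, order preserved
    seen = set(frontier)
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue  # download failed; move on to the next URL
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before queueing.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return pages


# Toy in-memory "web" (an assumption) so the sketch runs without network access.
TOY_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.test/b": '<a href="/missing">gone</a>',
}

pages = crawl(["http://example.test/"], TOY_WEB.get)
print(sorted(pages))
```

Passing `fetch` as a parameter keeps the traversal logic separate from the download step, so the same loop could be driven by a real HTTP client instead of the toy dictionary.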
Optimization of Search Results with De-Duplication of Web Pages In a Mobile Web Crawler
We live in an information era in which search engines are the supreme gateways for access to information on the web. The efficiency and reliability of search engines are significantly affected by the large amount of duplicate content present on the World Wide Web. Web storage indexes are also affected by the presence of duplicate documents over web which leads to slowing down of serving results...
Journal
Journal title: International Journal of Machine Learning and Computing
Year: 2012
ISSN: 2010-3700
DOI: 10.7763/ijmlc.2012.v2.182